Improving XPath Query Execution in P2P XML Storage by Using a Decentralized Index

Authors

  • Konstantin Pussep
  • Predrag Knežević
  • Nicolas Liebau
  • Ralf Steinmetz

1 KOM, TU Darmstadt, Merckstrasse 25, 64283 Darmstadt, Germany {pussep,liebau,steinmetz}@kom.tu-darmstadt.de
2 Fraunhofer IPSI, Dolivostr. 15, 64293 Darmstadt, Germany [email protected]

Abstract

Today, information is increasingly managed in dynamic communities on the Internet. Peer-to-peer communities, in which users host data by contributing their own resources, are a very promising alternative to centralized hosting. The managed data is often represented in XML and requires a high-level query language, for which XPath is a good candidate. In this paper, we present a decentralized XML index that enables efficient XPath queries on large documents stored in p2p systems. Unlike other approaches, we do not rely on a specific overlay: our solution works on top of any structured overlay that provides the common put and get operations. Our approach is combined with P2P XML Storage, which efficiently stores arbitrarily large XML files in p2p networks. Our evaluation shows that the index improves XPath query performance, in both execution time and number of messages, by orders of magnitude.
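The abstract does not describe the index layout, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' design. It assumes a document is split into one fragment per child of the root element, and that a path index kept in the same put/get store maps each element path to the fragments containing it. `InMemoryDHT`, `publish`, `query`, `path_key`, and the key formats are all hypothetical names introduced for illustration; the in-memory dictionary stands in for any structured overlay offering put and get.

```python
import hashlib
import json
import xml.etree.ElementTree as ET


class InMemoryDHT:
    """Hypothetical stand-in for a structured overlay with put/get."""

    def __init__(self):
        self._store = {}
        self.messages = 0  # each put/get counts as one message (rough proxy)

    def put(self, key, value):
        self.messages += 1
        self._store[key] = value

    def get(self, key):
        self.messages += 1
        return self._store.get(key)


def path_key(doc_id, path):
    # Hash the (document, path) pair so index entries spread over the key space.
    digest = hashlib.sha1(f"{doc_id}|{path}".encode("utf-8")).hexdigest()
    return "idx:" + digest


def element_paths(elem, prefix):
    # Yield the absolute path of elem and of every descendant element.
    path = f"{prefix}/{elem.tag}"
    yield path
    for child in elem:
        yield from element_paths(child, path)


def publish(dht, doc_id, xml_text):
    """Store one fragment per root child and index the paths it contains."""
    root = ET.fromstring(xml_text)
    index = {}
    for i, child in enumerate(root):
        frag_key = f"frag:{doc_id}:{i}"
        dht.put(frag_key, ET.tostring(child, encoding="unicode"))
        for path in sorted(set(element_paths(child, "/" + root.tag))):
            index.setdefault(path, []).append(frag_key)
    for path, frag_keys in index.items():
        dht.put(path_key(doc_id, path), json.dumps(frag_keys))


def query(dht, doc_id, path):
    """Answer a simple absolute child-axis path such as /library/book/title."""
    raw = dht.get(path_key(doc_id, path))
    if raw is None:
        return []  # no fragment contains this path, so nothing else is fetched
    rest = path.strip("/").split("/")[2:]  # steps below the fragment root
    results = []
    for frag_key in json.loads(raw):
        frag = ET.fromstring(dht.get(frag_key))
        matches = frag.findall("/".join(rest)) if rest else [frag]
        results.extend(m.text for m in matches)
    return results
```

With this layout, a query costs one index lookup plus one get per matching fragment, whereas a lookup without the index would have to fetch every fragment of the document. Only simple absolute paths are handled here; predicates and other XPath axes are outside the scope of the sketch.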

Similar resources

VAMANA: A High Performance, Scalable and Cost Driven XPath Engine

Many applications are migrating to or beginning to make use of native XML data. We anticipate that queries will emerge that emphasize the structural semantics of XML query languages like XPath and XQuery. This brings a need for an efficient query engine and database management system tailored for XML data similar to traditional relational engines. While mapping large XML documents into relational dat...

A Two-Step Approach for Tree-structured XPath Query Reduction

XML data consists of a very flexible tree-structure which makes it difficult to support the storing and retrieving of XML data. The node numbering scheme is one of the most popular approaches to store XML in relational databases. Together with the node numbering storage scheme, structural joins can be used to efficiently process the hierarchical relationships in XML. However, in order to proces...

A New Query Engine using Novel Three Dimensional Index for Xml Documents

XML has gained prominence as data storage and exchange format for web applications. This is because there are certain features which are unique to XML like self descriptivism, extensibility and non proprietary text document storage. In spite of all these unique features XML has an inherent limitation of verbosity. This size problem of XML should be dealt with efficiently so that a good compress...

Relational Approach to Logical Query Optimization of XPath

To be able to handle the ever growing volumes of XML documents, effective and efficient data management solutions are needed. Managing XML data in a relational DBMS has great potential. Recently, effective relational storage schemes and index structures have been proposed as well as special-purpose join operators to speed up querying of XML data using XPath/XQuery. In this paper, we address the...

Towards Internet-Scale Cardinality Estimation of XPath Queries over Distributed XML Data

In the last decade, we have witnessed a huge success of the peer-to-peer (P2P) computing model. This has led to the development of many Internet-scale applications and systems that are used commercially. Recently, the problem of computing statistics over data in Internet-scale systems has received attention. In this paper, we discuss the problem of cardinality estimation of XPath queries over d...

Publication date: 2007